- CHAPTER 8 NUMBERS AND EXPRESSIONS 8-1
-
- Numbers and Bases
-
- A86 supports a variety of formats for numbers. In non-computer
- life, we write numbers in a decimal format. There are ten
- digits, 0 through 9, that we use to describe numbers; and each
- digit-position is ten times as significant as the position to its
- right. The number ten is called the "base" of the decimal
- format. Computer programmers often find it convenient to use
- other bases to specify numbers used in their programs. The most
- commonly-used bases are two (binary format), sixteen (hexadecimal
- format), and eight (octal format).
-
- The hexadecimal format requires sixteen digits. The extra six
- digits beyond 0 through 9 are denoted by the first six letters of
- the alphabet: A for ten, B for eleven, C for twelve, D for
- thirteen, E for fourteen, and F for fifteen.
-
- In A86, a number must always begin with a digit from 0 through 9,
- even if the base is hexadecimal. This is so that A86 can
- distinguish between a number and a symbol that happens to have
- digits in its name. If a hexadecimal number would begin with a
- letter, you precede the letter with a zero. For example, hex A0,
- which is the same as decimal 160, would be written 0A0.
-
- Because it is necessary for you to prepend leading zeroes to many
- hex numbers, and because you never have to do so for decimal
- numbers, I decided to make hexadecimal the default base for
- numbers with leading zeroes. Decimal is still the default base
- for numbers beginning with 1 through 9.
-
- Large numbers can be given as the operands to DD, DQ, or DT
- directives. For readability, you may freely intersperse
- underscore characters anywhere within your numbers.
-
- The default base can be overridden, with a letter or letters at
- the end of the number: B or xB for binary, O or Q for octal, H or
- R for hexadecimal, and D or xD for decimal. Examples:
-
- 077Q octal, value is 8*7 + 7 = 63 in decimal notation
- 123O octal, if the "O" is a letter: 64 + 2*8 + 3 = 83 decimal
- 1230 decimal 1230, showing why you should use "Q" for octal!!
- 01234567H large constant
- 0001_0000_0000_0000_0003R real number specified in hexadecimal
- 100D superfluous D indicates decimal base
- 0100D hex number 100D, which is 4096 + 13 = 4109 in decimal
- 0100xD decimal 100, since xD overrides the default hex-format
- 0110B hex 110B, which is 4096 + 256 + 11 = 4363 in decimal
- 0110xB binary 4+2 = 6 in decimal notation
- 110B also binary 4+2 = 6, since "B" is not a decimal-digit
-
- The last five examples above illustrate why an "x" is sometimes
- necessary before the base-override letter "B" or "D". If that
- letter can be interpreted as a hex digit, it is; the "x" forces
- an override-interpretation for the "B" or "D". By the way, the
- usage of lower-case for x and upper-case for the following
- override-letter is simply a recommendation; A86 always treats
- upper- and lower-case letters equivalently.
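-
- The default-base and override rules above can be modeled in a
- few lines of Python. This is only a sketch of the rules as this
- section states them, not A86's actual scanner: the function name
- parse_a86_number is made up for illustration, the R suffix is
- left out, and numbers that begin with 1 through 9 but contain
- hex letters are not handled.
-
```python
def parse_a86_number(text):
    """Return the value of an A86-style integer constant (sketch)."""
    s = text.replace("_", "").upper()      # underscores and case are ignored
    if not s or not s[0].isdigit():
        raise ValueError("a number must begin with a digit 0 through 9")
    leading_zero = s[0] == "0"             # leading zero: default base is hex

    if s.endswith("XB"):                   # xB always forces binary
        return int(s[:-2], 2)
    if s.endswith("XD"):                   # xD always forces decimal
        return int(s[:-2], 10)
    if s.endswith("H"):
        return int(s[:-1], 16)
    if s.endswith(("O", "Q")):
        return int(s[:-1], 8)
    if s[-1] in "BD" and not leading_zero:
        # B or D cannot be a decimal digit, so it acts as an override.
        return int(s[:-1], 2 if s[-1] == "B" else 10)
    # No override: a trailing B or D after a leading zero is a hex digit.
    return int(s, 16 if leading_zero else 10)


for text in ("0A0", "077Q", "100D", "0100D", "0100xD",
             "0110B", "0110xB", "110B"):
    print(text, "=", parse_a86_number(text))
```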
-
- The RADIX Directive
-
- The above-mentioned set of defaults (hex if leading zero, decimal
- otherwise) can be overridden with the RADIX directive. The RADIX
- directive consists of the word RADIX followed by a number from 2
- to 16. That number is itself ALWAYS read as decimal,
- regardless of any (or no) previous RADIX commands. The number
- gives the default base for ALL subsequent numbers, up to (but not
- including) the next RADIX command. If there is no number
- following RADIX, then A86 returns to its initial mixed-default of
- hex for leading zeroes, decimal for other leading digits.
-
- For compatibility with IBM's assembler, RADIX can appear with a
- leading period; although I curse the pinhead-designer who put
- that period into IBM's language.
-
- As an alternative to the RADIX directive, I provide the D-switch,
- which causes A86 to start with decimal defaults. You can put +D
- into the A86 command invocation, or into the A86 environment
- variable. The first RADIX command in the program will override
- the D switch setting.
-
- Following are examples of radix usage. The numbers in the
- comments are all in decimal notation.
-
- DB 10,010 ; produces 10,16 if RADIX was not seen yet
- ; and +D switch was not specified
- RADIX 10
- DB 10,010 ; produces 10,10
- RADIX 16
- DB 10,010 ; produces 16,16
- RADIX 2
- DB 10,01010 ; produces 2,10
- RADIX 3 ; for Martian programmers in Heinlein novels
- DB 10,100 ; produces 3,9
- RADIX
- DB 10,010 ; produces 10,16
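-
- The results in the comments above can be checked with a small
- Python model of the default base. It is a sketch only: the
- function name default_base_value is hypothetical, and
- base-override suffixes and the +D switch are not modeled.
-
```python
def default_base_value(digits, radix=None):
    """Value of a suffix-less number under an optional RADIX setting."""
    if radix is None:
        # No RADIX in effect: hex if leading zero, decimal otherwise.
        radix = 16 if digits.startswith("0") else 10
    elif not 2 <= radix <= 16:
        raise ValueError("RADIX takes a number from 2 to 16")
    return int(digits, radix)


# (RADIX setting, the two DB operands) taken from the examples above.
cases = [(None, "10", "010"), (10, "10", "010"), (16, "10", "010"),
         (2, "10", "01010"), (3, "10", "100")]
for radix, first, second in cases:
    print(radix, default_base_value(first, radix),
          default_base_value(second, radix))
```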
-
-
-
-
- Floating-point Initializations
-
- A86 allows floating-point numbers as the operands to DD, DQ, and
- DT directives. The numbers are encoded according to the IEEE
- standard, which the 8087 and 80287 coprocessors follow. The
- format for floating-point constants is as follows: First, there
- is a decimal number containing a decimal point. There must be a
- decimal point, or else the number is interpreted as an integer.
- There must also be at least one decimal digit, either to the left
- or right of the decimal point, or else the decimal point is
- interpreted as an addition (structure-element) operator.
- Optionally, there may follow immediately after the decimal number
- the letter E followed by a decimal number. The E stands for
- "exponent", and means "times 10 raised to the power of". You may
- provide a + or - between the E and its number. Examples:
-
- 0.1 constant one-tenth
- .1 the same
- 300. floating-point three hundred
- 30.E1 30 * 10**1; i.e., three hundred
- 30.E+1 the same
- 30.E-1 30 * 10**-1; i.e., three
- 30E1 not floating-point: hex integer 030E1
- 1.234E20 scientific notation: 1.234 times 10 to the 20th
- 1.234E-20 a tiny number: 1.234 divided by 10 to the 20th
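-
- As a cross-check on the notation, Python's float() happens to
- accept the same spellings, and its struct module can show the
- IEEE byte layout. The sketch below assumes that a DD operand is
- stored as a 4-byte IEEE single and a DQ operand as an 8-byte IEEE
- double, both little-endian; A86's own rounding and the 10-byte DT
- format are not modeled, and ieee_bytes is a made-up helper name.
-
```python
import struct

# The floating-point spellings from the examples above.
spellings = ("0.1", ".1", "300.", "30.E1", "30.E+1", "30.E-1",
             "1.234E20", "1.234E-20")
values = {text: float(text) for text in spellings}


def ieee_bytes(value, fmt):
    """Little-endian IEEE encoding: "<f" is 4 bytes, "<d" is 8 bytes."""
    return struct.pack(fmt, value)


single = ieee_bytes(300.0, "<f")    # the size a DD operand occupies
double = ieee_bytes(300.0, "<d")    # the size a DQ operand occupies

for text, value in values.items():
    print(text, "->", value)
print(single.hex(), double.hex())
```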
-
-
-
- Overview of Expressions
- -------- -- -----------
-
- Most of the operands that you code into your instructions and
- data initializations will be simple register names, variable
- names, or constants. However, you will regularly wish to code
- operands that are the results of an arithmetic calculation,
- performed either by the machine when the program is running (for
- indexing), or by the assembler (to determine the value to
- assemble into the program). A86 has a full set of operators that
- you can use to create expressions to cover these cases:
-
- * Arithmetic Operators
- byte isolation and combination (HIGH, LOW, BY)
- addition and subtraction (+,-)
- multiplication and division (* , /, MOD)
- shifting operators (SHR, SHL, BIT)
-
- * Logical Operators
- (AND, OR, XOR, NOT)
-
- * Relational Operators
- (EQ, LE, LT, GE, GT, NE)
-
- * Attribute Operators/Specifiers
- size specifiers (B=BYTE,W=WORD,F=FAR,SHORT,LONG)
- attribute specifiers (OFFSET,NEAR,brackets)
- segment-addressing specifier (:)
- compatibility operators (PTR,ST)
- built-in value specifiers (TYPE,THIS,$)
-
- * Special Data Duplication Operator
- (DUP) --see Chapter 9 for a description
-
-
- Types of Expression Operands
- ----- -- ---------- --------
-
- Numbers and Label Addresses
-
- A number or constant (16-bit number) can be used in most
- expressions. A label (defined with a colon) is also treated as
- a constant and so can be used in expressions, except when it is a
- forward reference.
-
-
- Variables
-
- A variable stands for a byte- or word-memory location. You may
- add or subtract constants from variables; when you do so, the
- constant is added to the address of the variable. You typically
- do this when the variable is the name of a memory array.
-
-
- Index Expressions
-
- An index expression consists of a combination of a base register
- [BX] or [BP], and/or an index register [SI] or [DI], with an
- optional constant added or subtracted. You will usually want to
- precede the bracketed expression with B, W, or F; to specify the
- kind of memory unit (byte, word, or far-pointer) you are
- referring to. The expression stands for the memory unit whose
- address is the run-time value(s) of the base and/or index
- registers added to the constant. See the Effective Address
- section and the beginning of this chapter for more details on
- indexed memory.
-
-
- Arithmetic Operators
- ---------- ---------
- HIGH/LOW
-
- Syntax: HIGH operand
- LOW operand
-
- These operators are called the "byte isolation" operators. The
- operand must evaluate to a 16-bit number. HIGH returns the
- high-order byte of the number; LOW the low-order byte.
-
- For example,
-
- MOV AL,HIGH(01234) ; AL = 012
- TENHEX EQU LOW(0FF10) ; TENHEX = 010
-
- These operators can be applied to each other. The following
- identities apply:
-
- LOW LOW Q = LOW Q
- LOW HIGH Q = HIGH Q
- HIGH LOW Q = 0
- HIGH HIGH Q = 0
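-
- The identities follow directly from how the two bytes are
- extracted. A minimal Python sketch, assuming a 16-bit operand
- (the function names high and low are illustrative only):
-
```python
def high(q):
    """High-order byte of a 16-bit number."""
    return (q >> 8) & 0xFF


def low(q):
    """Low-order byte of a 16-bit number."""
    return q & 0xFF


q = 0x1234
print(hex(high(q)), hex(low(q)))
# Applying the operators to each other, as in the identities above.
print(low(low(q)) == low(q), low(high(q)) == high(q),
      high(low(q)) == 0, high(high(q)) == 0)
```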
-
-
- BY
-
- Syntax: operand BY operand
-
- This operator is a "byte combination" operator. It returns the
- word whose high byte is the left operand, and whose low byte is
- the right operand. For example, the expression 3 BY 5 is the
- same as hexadecimal 0305. The BY operator is exclusive to A86.
- I added it to cover the following situation: Suppose you are
- initializing your registers to immediate values. Suppose you
- want to initialize AH to the ASCII value 'A', and AL to decimal
- 10. You could code this as two instructions MOV AH,'A' and MOV
- AL,10; but you realize that a single load into the AX register
- would save both program space and execution time. Without the BY
- operator, you would have to code MOV AX,0410A, which disguises
- the types of the individual byte-operands you were thinking
- about. With BY, you can code it properly: MOV AX,'A' BY 10.
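-
- The byte combination can be sketched in Python as follows. The
- function name by is illustrative, and masking each operand to
- eight bits is an assumption about out-of-range operands, which
- this section does not describe.
-
```python
def by(high_byte, low_byte):
    """Word whose high byte is high_byte and whose low byte is low_byte."""
    return ((high_byte & 0xFF) << 8) | (low_byte & 0xFF)


print(hex(by(3, 5)))            # 3 BY 5
print(hex(by(ord("A"), 10)))    # 'A' BY 10, the value loaded into AX
```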
-
-
- Addition (combination)
-
- Syntax: operand + operand
- operand.operand
- operand PTR operand
- operand operand
-
- As shown in the above syntax, addition can be accomplished in
- four ways: with a plus sign, with a dot operator, with a PTR
- operator, and simply by juxtaposing two operands next to each
- other. The dot and PTR operators are provided for compatibility
- with Intel/IBM assemblers. The dot is used in structure-field
- notation; PTR is used in expressions such as BYTE PTR 0. (See
- Chapter 12 for recommendations concerning PTR.)
-
- If either operand is a constant, the answer is an expression with
- the typing of the other operand, with the offsets added. For
- example, if BVAR is a byte variable, then BVAR + 100 is the byte
- variable 100 bytes beyond BVAR.
-
- Other examples:
- DB 100+17 ; simple addition
- CTRL EQU -040
- MOV AL,CTRL'D' ; a nice notation for control-D!
- MOV DX,[BP].SMEM ; --where SMEM was in an unindexed structure
- DQ 10.0 + 7.0 ; floating-point addition
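-
- The control-D example works because juxtaposition adds the two
- operands: -040 (hex) plus the ASCII code of the letter gives the
- control code. A small Python sketch of that arithmetic, where
- ctrl is a hypothetical helper name:
-
```python
CTRL = -0x40    # same value as CTRL EQU -040 above


def ctrl(letter):
    """Control code for an upper-case letter: CTRL plus its ASCII value."""
    return CTRL + ord(letter)


print(ctrl("D"))    # the value assembled for CTRL'D'
```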
-
- Subtraction
-
- Syntax: operand - operand
-
- The subtraction operator may have operands that are:
-
- a. both absolute numbers
- b. variable names that have the same type
-
- The result is an absolute number; the difference between the two
- operands.
-
- Subtraction is also allowed between floating-point numbers; the
- answer is the floating-point difference.
-
-
- Multiplication and Division
-
- Syntax:
- Multiplication: operand * operand
- Division: operand / operand
- Modulo: operand MOD operand --(absolute operands only)
-
- You may only use these operators with absolute or floating-point
- numbers, and the result is always the same type. Either operand
- may be a numeric expression, as long as the expression evaluates
- to an absolute or floating-point number. Examples:
-
- CMP AL,2 * 4 ; compare AL to 8
- MOV BX,0123/16 ; BX = 012
- DT 1.0 / 7.0
-
-
-
- Shifting Operators
-
- Syntax: Shift right: operand SHR count
- Shift left: operand SHL count
- Bit number: BIT count
-
- The shift operators will perform a "bit-wise" shift of the
- operand. The operand will be shifted "count" bits either to the
- right or the left. Bits shifted into the operand will be set to
- 0.
-
- The expression "BIT count" is equivalent to "1 SHL count"; i.e.,
- BIT returns the mask of the single bit whose number is "count".
- The operands must be numeric expressions that evaluate to
- absolute numbers. Examples:
-
- MOV BX, 0FACBH SHR 4 ; BX = 0FACH
- OR AL,BIT 6 ; AL = AL OR 040; 040 is the mask for bit 6
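-
- The two examples can be reproduced with a short Python sketch.
- Keeping results to 16 bits is an assumption made here to match
- the 16-bit constants used in this chapter; the function names
- are illustrative.
-
```python
WORD_MASK = 0xFFFF    # assumption: results are kept to 16 bits


def shr(value, count):
    """Shift right; zeroes enter from the left."""
    return (value & WORD_MASK) >> count


def shl(value, count):
    """Shift left; zeroes enter from the right."""
    return (value << count) & WORD_MASK


def bit(count):
    """Mask of the single bit numbered count, i.e. 1 SHL count."""
    return shl(1, count)


print(hex(shr(0xFACB, 4)))    # 0FACBH SHR 4
print(hex(bit(6)))            # BIT 6
```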
-
-
- Logical Operators
- ------- ---------
- Syntax: operand OR operand
- operand XOR operand
- operand AND operand
- NOT operand
-
- The logical operators may only be used with absolute numbers.
- They always return an absolute number.
-
- Logical operators operate on individual bits. Each bit of the
- answer depends only on the corresponding bit in the operand(s).
-
- The functions performed are as follows:
-
- 1. OR: An answer bit is 1 if either or both of the operand bits
- is 1. An answer bit is 0 only if both operand bits are 0.
-
- Example:
-
- 11110000xB OR 00110011xB = 11110011xB
-
-
- 2. XOR: This is "exclusive OR." An answer bit is 1 if the
- operand bits are different; an answer bit is 0 if the operand
- bits are the same. Example:
-
- 11110000xB XOR 00110011xB = 11000011xB
-
-
- 3. AND: An answer bit is 1 only if both operand bits are 1. An
- answer bit is 0 if either or both operand bits are 0.
- Example:
-
- 11110000xB AND 00110011xB = 00110000xB
-
- 4. NOT: An answer bit is the opposite of the operand bit. It
- is 1 if the operand bit is 0; 0 if the operand bit is 1.
- Example:
-
- NOT 00110011xB = 11001100xB
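-
- The four examples can be verified with Python's bitwise
- operators. The sketch uses an 8-bit width for NOT to match the
- 8-bit operands shown above; the bits parameter and the a86_
- function names are assumptions for illustration.
-
```python
def a86_or(a, b):
    return a | b


def a86_xor(a, b):
    return a ^ b


def a86_and(a, b):
    return a & b


def a86_not(a, bits=8):
    """Invert every bit, keeping only the low `bits` bits of the result."""
    return ~a & ((1 << bits) - 1)


x, y = 0b11110000, 0b00110011
for result in (a86_or(x, y), a86_xor(x, y), a86_and(x, y), a86_not(y)):
    print(format(result, "08b"))
```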
-
-
- Relational Operators
- ---------- ---------
- Syntax:
- equal: operand EQ operand
- not equal: operand NE operand
- less than: operand LT operand
- less or equal: operand LE operand
- greater than: operand GT operand
- greater or equal: operand GE operand
-
- The relational operators may have operands that are:
-
- a. both absolute numbers
- b. variable names that have the same type
-
- The result of a relational operation is always an absolute
- number. The operators return an 8- or 16-bit result of all 1's
- for TRUE and all 0's for FALSE. Examples:
-
- MOV AL, 3 EQ 0 ; AL = 0 (false)
- MOV AX, 2 LE 15 ; AX = 0FFFFH (true)
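-
- The all-ones and all-zeroes convention can be sketched in Python
- as follows. The function name relational and the bits parameter
- (standing in for the size of the destination) are hypothetical.
-
```python
def relational(op, left, right, bits=16):
    """All 1's (in `bits` bits) if the comparison holds, else 0."""
    outcome = {
        "EQ": left == right,
        "NE": left != right,
        "LT": left < right,
        "LE": left <= right,
        "GT": left > right,
        "GE": left >= right,
    }[op]
    return (1 << bits) - 1 if outcome else 0


print(hex(relational("EQ", 3, 0, bits=8)))    # 3 EQ 0 moved into AL
print(hex(relational("LE", 2, 15)))           # 2 LE 15 moved into AX
```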
-
-
- Attribute Operators/Specifiers
- --------- --------------------
- B,W,D,Q,T memory-variable specifiers
-
- Syntax: B operand Q operand
- operand B operand Q
- W operand T operand
- operand W operand T
- D operand
- operand D
-
- B, W, D, F, Q, and T convert the operand into a byte, word,
- doubleword, far, quadword, and ten-byte variable, respectively.
- The operand can be a constant, or a variable of another type.
- Examples:
-
- ARRAY_PTR:
- DB 100 DUP (?)
- WVAR DW ?
- MOV AL,ARRAY_PTR B ; load first byte of ARRAY_PTR array into AL
- MOV AL,WVAR B ; load the low byte of WVAR into AL
- MOV AX,W[01000] ; load AX with the memory-word at loc. 01000
- LDS BX,D[01000] ; load DS:BX with the doubleword at loc. 01000
- JMP F[01000] ; jump far through the 4-byte pointer at 01000
- FLD T[BX] ; load ten-byte number at [BX] to 87 stack
-
-
- For compatibility with Intel/IBM assemblers, A86 accepts the more
- verbose synonyms BYTE, WORD, DWORD, FAR, QWORD, and TBYTE for
- B,W,D,F,Q,T, respectively.
-
-
- SHORT and LONG operators
-
- Syntax: SHORT label
- LONG label
-
- The SHORT operator is used to specify that the label referenced
- by a JMP instruction is within 127 bytes of the end of the
- instruction. The LONG operator specifies the opposite: that the
- label is not within 127 bytes. The appropriate operator can (and
- sometimes must) be used if the label is forward referenced in the
- instruction.
-
- When a non-local label is forward referenced, the assembler
- assumes that it will require two bytes to represent the relative
- offset of the label. By correctly using the SHORT operator, you
- can save a byte of code when you use a forward reference. If the
- label is not within the specified range, an error will occur.
- The following example illustrates the use of the SHORT operator.
-
- JMP FWDLAB ; three-byte instruction
- JMP SHORT FWDLAB ; two-byte instruction
- JMP >L1 ; two-byte instruction assumed for a local label
-
- Because the assembler assumes that a forward-reference local
- label is SHORT, you may sometimes be forced to override this
- assumption if the label is in fact not within 127 bytes of the
- JMP. This is why LONG is provided:
-
- JMP LONG >L9 ; three byte instruction
-
- If you are bothered by this possibility, you can specify the +L
- switch, which causes A86 to pessimistically generate the three
- byte JMP for all forward references, unless specifically told not
- to with SHORT.
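-
- The sizes and the range rule can be put into a short Python
- model. It assumes the standard 8086 encodings: a two-byte short
- JMP carries a signed 8-bit displacement counted from the end of
- the instruction, and the near JMP is three bytes. The function
- name jmp_length and its force parameter are hypothetical, and the
- model picks a size from known addresses; it does not reproduce
- A86's guessing for forward references.
-
```python
def jmp_length(jmp_addr, target_addr, force=None):
    """Encoded length in bytes of an unconditional JMP (sketch)."""
    # Displacement of a short JMP, measured from the end of its 2 bytes.
    displacement = target_addr - (jmp_addr + 2)
    fits_short = -128 <= displacement <= 127
    if force == "SHORT":
        if not fits_short:
            raise ValueError("label is out of range for SHORT")
        return 2
    if force == "LONG":
        return 3
    return 2 if fits_short else 3


print(jmp_length(0x100, 0x150))                  # nearby label
print(jmp_length(0x100, 0x300))                  # distant label
print(jmp_length(0x100, 0x150, force="LONG"))    # LONG forces 3 bytes
```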
-
- NOTE that LONG will have effect only on the operand to an
- unconditional JMP instruction; not to conditional jumps. This is
- because the conditional jumps don't have 3-byte forms; the only
- conditional jumps are short ones. If you run into this problem,
- then chances are your code is getting out of control--time to
- rearrange, or to break off some of the intervening code into
- separate procedures. If you insist upon leaving the code intact,
- you can replace the conditional jump with an "IF cond JMP".
-
-
- OFFSET operator
-
- Syntax: OFFSET var-name
-
- OFFSET is used to convert a variable into the constant pointer to
- the variable. For example, if you have declared XX DW ?, and
- you want to load SI with the pointer to the variable XX, you can
- code: MOV SI,OFFSET XX. The simpler instruction MOV SI,XX moves
- the variable contents of XX into SI, not the constant pointer to
- XX.
-
-
- NEAR Operator
-
- Syntax: NEAR operand
-
- NEAR converts the operand to have the type of a code label, as if
- it were defined by appearing at the beginning of a program line
- with a colon after it. NEAR is provided mainly for compatibility
- with Intel/IBM assemblers.
-
- Square Brackets Operator
-
- Syntax: [operand]
-
- Square brackets around an operand give the operand a memory-
- variable type. Square brackets are generally used to enclose the
- names of base and index registers: BX, BP, SI, and DI. When the
- size of the memory variable can be deduced from the context of
- the expression, they are also used to turn numeric constants into
- memory variables. Examples:
-
- MOV B[BX+50],047 ; move imm. value 047 into memory byte at BX+50
- MOV AL,[050] ; move byte at memory location 050 into AL
- MOV AL,050 ; move immediate value 050 into AL
-
-
- Colon Operator
-
- Syntax: constant:operand
- segreg:operand
-
- The colon operator is used to attach a segment-register value to
- an operand. The segment-register value appears to the left of
- the colon; the rest of the operand appears to the right of the
- colon.
-
- There are two forms to the colon operator. The first form has a
- constant as the segment-register value. This form is used to
- create an operand to a long (inter-segment) JMP or CALL
- instruction. An example of this is the instruction JMP 0FFFF:0,
- which jumps to the cold-boot reset location of the 86 processor.
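-
- To see where JMP 0FFFF:0 lands, recall that the 8086 forms a
- 20-bit physical address from the segment value times 16 plus the
- offset. That rule is background on the processor rather than
- something stated in this chapter, and physical_address is a
- made-up helper name.
-
```python
def physical_address(segment, offset):
    """20-bit 8086 physical address for a segment:offset pair."""
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(physical_address(0xFFFF, 0)))    # target of JMP 0FFFF:0
```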
-
- The only context other than JMP or CALL in which this first form
- is legal, is as the operand to a DD directive or an EQU
- directive. The EQU case has a further restriction: the offset
- (the part to the right of the colon) must have a value less than
- 256. This is because there simply isn't room in a symbol-table
- entry for a segment-register value AND a 2-byte offset. I don't
- think you will be hurt by this restriction, since references to
- other segments are usually to jump-tables at the beginning of
- those segments.
-
- The second form has a segment register name to the left of the
- colon. This is the segment-override form, provided for
- compatibility with Intel/IBM assemblers. A86 will generate a
- segment-override byte when it sees this form, unless the operand
- to the right of the colon already has a default segment register
- that is the same as the given override.
-
- I prefer the more explicit method of overrides, exclusive to A86:
- simply place the segment register name before the instruction
- mnemonic. For example, I prefer ES MOV AL,[BX] to MOV
- AL,ES:[BX].
-
- ST Operator
-
- ST is ignored whenever it occurs in an expression. It is
- provided for compatibility with Intel and IBM assemblers.
- For example, you can code FLD ST(0),ST(1), which will be taken by
- A86 as FLD 0,1.
-
-
- TYPE Operator
-
- Syntax: TYPE operand
-
- The TYPE operator returns 1 if the operand is a byte variable; 2
- if the operand is a word variable; 4 if the operand is a
- doubleword variable; 8 if the operand is a quadword variable; 10
- if the operand is a ten-byte variable; and the number of bytes
- allocated by the structure if the operand is a structure name.
-
-
- THIS and $ Specifiers
-
- THIS returns the value of the current location counter. It is
- provided for compatibility with Intel/IBM assemblers. The
- dollar-sign $ is the more standard and familiar specifier for
- this purpose; it is equivalent to THIS NEAR. THIS is typically
- used with the BYTE and WORD specifiers to create alternate-typed
- symbols at the same memory location:
-
- BVAR EQU THIS BYTE
- WVAR DW ?
-
- I don't recommend the use of THIS. If you wish to retain Intel-
- compatibility, you can use the less-verbose LABEL directive:
-
- BVAR LABEL BYTE
- WVAR DW ?
-
- If you are not concerned with compatibility with lesser assemblers,
- A86 offers a variety of less-verbose forms. The most concise is
- DB without an operand:
-
- BVAR DB
- WVAR DW ?
-
- If this is too cryptic for you, there is always BVAR EQU B[$].
-
- Operator Precedence
- -------- ----------
-
- Consider the expression 1 + 2 * 3. When A86 sees this
- expression, it could perform the multiplication first, giving an
- answer of 1+6 = 7; or it could do the addition first, giving an
- answer of 3*3 = 9. In fact, A86 does the multiplication first,
- because A86 assigns a higher precedence to multiplication than it
- does addition.
-
- The following list specifies the order of precedence A86 assigns
- to expression operators. All expressions are evaluated from left
- to right following the precedence rules. You may override this
- order of evaluation and precedence through the use of parentheses
- (). In the example above, you could override the precedence by
- parenthesizing the addition: (1+2) * 3.
-
- Some symbols that we have referred to as operators, are treated
- by the assembler as operands having built-in values. These
- include B, W, F, $, and ST.
-
- If two operators are adjacent, the rightmost operator must have
- precedence; otherwise, parentheses must be used.
-
- ---Highest Precedence---
-
- 1. Parenthesized expressions
- 2. Period, colon for segment-override
- 3. OFFSET, TYPE, and PTR
- 4. HIGH, LOW, and BIT
- 5. Multiplication and division: *, /, MOD, SHR, SHL
- 6. Addition and subtraction: +,-
- a. unary
- b. binary
- 7. Relational: EQ, NE, LT, LE, GT, GE
- 8. Logical NOT
- 9. Logical AND
- 10. Logical OR and XOR
- 11. Colon for long pointer, SHORT, LONG, and BY
- 12. DUP
-
- ---Lowest Precedence---
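-
- The effect of the list can be tried out with a small
- precedence-climbing evaluator in Python. It is a sketch under
- stated assumptions, not A86's expression parser: it handles only
- decimal integers, parentheses, and the binary operators at levels
- 5, 6, 9, and 10 of the list; unary operators, relational
- operators, and NOT are left out; "/" is taken to be integer
- division; and all names in it are illustrative.
-
```python
import re

# Binding strength of the binary operators modeled here, following the
# precedence list above: a larger number binds tighter.
PRECEDENCE = {
    "*": 4, "/": 4, "MOD": 4, "SHR": 4, "SHL": 4,
    "+": 3, "-": 3,
    "AND": 2,
    "OR": 1, "XOR": 1,
}

APPLY = {
    "*": lambda a, b: a * b,
    "/": lambda a, b: a // b,      # integer division (assumption)
    "MOD": lambda a, b: a % b,
    "SHR": lambda a, b: a >> b,
    "SHL": lambda a, b: a << b,
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}

TOKEN = re.compile(r"\s*(\d+|[A-Za-z]+|[-+*/()])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError("unexpected character at position %d" % pos)
        tokens.append(match.group(1).upper())
        pos = match.end()
    return tokens


def evaluate(text):
    """Evaluate decimal integers, parentheses and the operators above."""
    tokens = tokenize(text)
    pos = 0

    def primary():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("operand expected")
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = expression(0)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing )")
            pos += 1
            return value
        return int(token)          # ValueError if not a decimal number

    def expression(min_strength):
        nonlocal pos
        left = primary()
        while (pos < len(tokens)
               and PRECEDENCE.get(tokens[pos], -1) >= min_strength):
            operator = tokens[pos]
            pos += 1
            # The right operand may only contain tighter-binding operators,
            # so operators of equal strength group from left to right.
            right = expression(PRECEDENCE[operator] + 1)
            left = APPLY[operator](left, right)
        return left

    result = expression(0)
    if pos != len(tokens):
        raise ValueError("unexpected token " + tokens[pos])
    return result


print(evaluate("1 + 2 * 3"))      # multiplication first
print(evaluate("(1 + 2) * 3"))    # parentheses override the precedence
```
-
- Note that this ordering differs from Python's own: under the
- list above SHL and SHR bind as tightly as multiplication, so
- 1 + 2 SHL 3 groups as 1 + (2 SHL 3).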
-
-